METHOD OF JOINT DECODING OF POSSIBLY MUTILATED CODE WORDS

BACKGROUND OF THE INVENTION 
Field Of The Invention 

[0001] The invention relates to a method of decoding possibly mutilated code words of a code, wherein an information word and an address word are encoded into a code word of said code using a generator matrix, and wherein said address words are selected such that address words having a known relationship are assigned to consecutive code words. The invention further relates to a corresponding apparatus for decoding possibly mutilated code words and to a computer program for implementing said method.

Description Of The Related Art 

[0002] In European Patent Application No. 01201841.2, corresponding to U.S. Patent Application Publication No. 2003/0066014-A1 (PHNL 010331), the concept of coding for informed decoding is described. An enhancement can be found in European Patent Application No. 01203147.2, corresponding to U.S. Patent Application Publication No. 2003/0095056-A1 (PHNL 010600). The key element of the inventions described in these applications is an appropriate selection of the mapping of information strings onto code words of the Error Correcting Code (ECC) that is applied. The aim of coding for informed decoders is to enable more reliable information retrieval if the decoder knows part of the encoded information a priori. A typical example is in the field of address retrieval on optical media. In the case of a forced jump to a certain sector, part of the address of the sector in which the read/write head will land is known, taking the jump accuracy into account. For instance, in DVR (Digital Video Recording), it is anticipated that a series of messages is encoded on the spiral track of the optical disc which aids in the logical positioning of the read/write head above the disc and which contains, e.g., copyright information, date information and a logical track counter. Of this information, only the logical track counter changes between successive messages. Therefore, a priori known information can be available from successful decoding of previous messages.
[0003] According to the solutions described in the above-mentioned patent applications, decoding of a second message can only gain from the decoding of the first message if said first decoding is successful. A second decoding can then, in turn, aid in the decoding of the third message, and so on. However, if the decoding of the first message fails, the fact that the second message has been encoded in a special way does not improve the error correcting capabilities, i.e., the second and any further decoding is not helped by previous decodings. All decodings could then fail. In other situations, for example, at the very start of a recording or playback session, not much is known about the actual landing place of the read/write head. In that case, only the error correcting capabilities of the code can be used for retrieving information.

SUMMARY OF THE INVENTION 
[0004] It is therefore an object of the present invention to provide an improved method and apparatus for decoding possibly mutilated code words which can be used if the error correcting capabilities of the code are insufficient to allow reliable information retrieval, and which can also be used if there are so many errors that, despite application of informed decoding as described in the above-mentioned patent applications, these errors cannot be corrected.

[0005] This object is achieved, according to the present invention, by a method of decoding as claimed in claim 1, comprising the steps of:

[0006] decoding the differences of a number of pairs of possibly mutilated code words to obtain estimates for the differences of the corresponding pairs of code words,

[0007] combining said estimates to obtain a number of at least two corrupted versions of a particular code word,

[0008] forming a code vector from said number of corrupted versions of said particular code word in each coordinate,

[0009] decoding said code vector to a decoded code word in said code, and

[0010] using said generator matrix to obtain the information word and the address word embedded in said decoded code word.
[0011] The present invention is based on the idea of exploiting certain relationships between consecutive code words and of jointly decoding several such consecutive code words. "Consecutive" code words in this context shall mean code words which are read and/or input into the decoder one after another, e.g., code words which are located in series next to each other in a data stream, or which are stored in consecutive sectors on an information carrier, such as a CD, DVD, DVR or magnetic disc.

[0012] The proposed method of joint decoding comprises two main elements. A first main element comprises the step of obtaining estimates for the differences of various pairs of consecutive code words by decoding the differences of their corrupted versions. According to a second main element, the decoding results of said differences of corrupted pairs of consecutive code words are combined, resulting in a number of corrupted versions of one and the same code word. These corrupted versions are then all used for obtaining the desired code word from which, finally, the information word and the address word can be retrieved. Since the address words encoded in the code words have a certain known relationship, much is known in advance about the difference of possibly mutilated code words, and this knowledge is advantageously used during decoding according to the invention.
[0013] Preferably, said code vector is formed by majority voting in each coordinate from the number of corrupted versions obtained from the estimates. If more than one value occurs most frequently among said number of corrupted versions, the corresponding coordinate of said code vector is erased according to a further preferred embodiment. Alternatively or in addition, reliability information available on the symbols of one or more possibly mutilated code words is used for selecting the coordinates of said code vector according to still another preferred embodiment. Reliability information could be information about the probability that a certain value is correct. If available, it is preferred to include reliability information for each of the bits of the code vector for enhancing its decoding.

[0014] A preferred way of obtaining an estimate for the difference of a pair of code words is to decode the difference of the corresponding pair of possibly mutilated code words to the closest code word from a subcode consisting of all possible differences of two consecutive code words of the main code. This closest code word is then used as said estimate.
[0015] The probability of incorrect decoding can be further reduced by introducing an extra check in which the obtained estimates are tested for having a predetermined form and/or a possible value. Preferably, if this check fails, the decoding result should be rejected.
[0016] The present invention is advantageously used for decoding of code words stored on an information carrier, such as an optical or magnetic disc. According to standards used for optical recording, such as the CD-DA or the DVR standard, address words assigned to consecutive code words are consecutive, e.g., are successively increased by one, and preferably represent the sector address of the sector in which the corresponding code word is stored. Said relationship between the address words assigned to consecutive code words is exploited by the present invention.

[0017] For reducing decoding delay, it is advantageous if the number of pairs of possibly mutilated code words to be decoded is as small as possible. However, at least two pairs of two consecutive possibly mutilated code words have to be decoded so that at least three estimates can be obtained, since, in particular, majority voting on two alternatives is not useful. If reliability information is used instead of majority voting, one pair might be sufficient.

[0018] According to another aspect of the present invention, it is proposed that in said step of combining said estimates to obtain a number of corrupted versions of a particular code word, a first corrupted version corresponds to a first possibly mutilated code word, a second corrupted version corresponds to the difference between a second possibly mutilated code word and a first estimate, obtained by decoding the difference between said first and said second possibly mutilated code words, and a third corrupted version corresponds to the difference between a third possibly mutilated code word, said first estimate and a second estimate, obtained by decoding the difference between said second and said third possibly mutilated code words.
[0019] According to still another aspect of the invention, the proposed solution could also be used in combination with the solutions described in the above-mentioned patent applications, which use a priori known information during decoding. A preferred way could be that, in a first step, the decoder tries to decode a possibly mutilated code word using a priori known information available on the address word embedded in said possibly mutilated code word. If this decoding fails or if the result is not reliable enough, then the present solution of joint decoding could be used. However, it is also possible that the present solution is always used in addition to the use of a priori known information during decoding.



BRIEF DESCRIPTION OF THE DRAWINGS 



[0020] The invention will now be explained in more detail with reference to the drawings, in which:

[0021] Fig. 1 shows the format of a data word to be encoded;

[0022] Fig. 2 shows the format of a code word;

[0023] Fig. 3 shows a block diagram of an encoding and decoding scheme; and

[0024] Fig. 4 shows a block diagram of the method of decoding according to the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
[0025] In the following, it is assumed that information is represented as a k-bit string called data word d as shown in Fig. 1, said data word d comprising an address word a(i) and an information word m. The address word a(i) is a b-bit string representing the sector address for sector i; the information word m is a (k-b)-bit string containing any information to be stored, such as audio, video, software, copyright or date information or any other kind of data. It should be noted that the address word a(i) is known if i is known and vice versa; however, knowledge of i does not give information on the information word m.

[0026] The data word d shown in Fig. 1 is encoded using a $k \times n$ binary generator matrix G so that the data word $d = (a(i), m)$ is mapped onto the n-bit code word $c(i, m) = (a(i), m)G$ as shown in Fig. 2.
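The mapping $c(i, m) = (a(i), m)G$ can be illustrated with a minimal sketch over GF(2). The helper names `address_bits` and `encode`, the toy matrix `G` and the parameters k = 4, n = 7, b = 2 are assumptions chosen only for illustration; they are not taken from the specification, and a(i) is assumed here to be the plain binary representation of i.

```python
def address_bits(i, b):
    """Plain b-bit binary representation of i, most significant bit first (assumed format)."""
    return [(i >> (b - 1 - t)) & 1 for t in range(b)]


def encode(data_word, G):
    """Compute the code word d*G over GF(2) for a k-bit data word d and a k x n matrix G."""
    k, n = len(G), len(G[0])
    if len(data_word) != k:
        raise ValueError("data word length must equal the number of rows of G")
    return [sum(data_word[r] * G[r][c] for r in range(k)) % 2 for c in range(n)]


# Hypothetical toy generator matrix with k = 4 rows and n = 7 columns.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
b = 2                      # address length (assumed)
m = [1, 0]                 # (k - b)-bit information word (assumed)
code_word = encode(address_bits(2, b) + m, G)  # c(2, m)
```

Because the encoding is linear over GF(2), the sum of two code words is the encoding of the sum of their data words, which is the property the joint decoding method relies on.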

[0027] Fig. 3 shows a block diagram of a typical system using encoding and decoding. Therein user data, e.g., audio or video data, coming from a data source 1, e.g., recorded on a master tape or master disc, are encoded before they are stored on a data carrier, e.g., a disc, or transmitted over a transmission channel, e.g., over the Internet. They are subsequently decoded again and forwarded to a data sink 9, e.g., for replay.
[0028] The user data of the source 1 are first encoded by a source encoder 2, then error correction encoded by an ECC encoder 3, and thereafter modulated by a modulator 4, e.g., an EFM modulator, before the encoded user data - the code words - are put on the channel 5 on which errors may be introduced into the code words. The term "channel" 5 is here interpreted broadly, and includes a transmission channel as well as storage of the encoded data on a data carrier for a later replay.

[0029] When replay of data is intended, the encoded data first have to be demodulated by a demodulator 6, e.g., an EFM demodulator, before they are error correction decoded by an ECC decoder 7 and source decoded by a source decoder 8. Finally, the decoded user data can be input to the sink 9, e.g., a player device for replay of the user data.
[0030] The method of decoding according to the present invention shall be explained in more detail with reference to Fig. 4. It shall be assumed that data stored in encoded form on the record carrier 10 shall be replayed. In a first step, an amount of data r is read by a reading unit 11 and forwarded to a decoding apparatus 12. On their way from the encoder to the decoder, errors might be introduced into the code words, e.g., by scratches on an optical record carrier or by transmission errors, so that the read code words r are possibly mutilated. Those errors shall be corrected by the decoder 12.
[0031] At first, the difference D of code words situated in consecutive sectors (or located consecutively in a transmitted data stream) is computed in unit 13. The difference of the code words in the sectors i and i+1 is computed as follows:

$$c(i, m_1) \oplus c(i+1, m_2) = (A(i),\, m_1 \oplus m_2)\,G,$$

wherein $A(i) = a(i) \oplus a(i+1)$ for $0 \le i \le 2^b - 2$. The key observation is that for appropriate choices of the address word a, much can be said about the difference A(i) of two consecutive address words. It should be noted that $\oplus$ indicates a modulo-2 operation in which additions and subtractions have the same result.

[0032] Assuming that corrupted versions $r_1, r_2, r_3$ of L = 3 consecutive code words $c_1, c_2, c_3$ are read, one can write for j = 1, 2, 3:

$$r_j = c(i+j-1, m_j) \oplus e_j = (a(i+j-1),\, m_j)\,G \oplus e_j,$$

$e_j$ representing an error vector. It is clear that

$$D_{12} = r_1 \oplus r_2 = (a(i), m_1)\,G \oplus (a(i+1), m_2)\,G \oplus (e_1 \oplus e_2) = (A(i),\, m_1 \oplus m_2)\,G \oplus (e_1 \oplus e_2),$$

and

$$D_{23} = r_2 \oplus r_3 = (A(i+1),\, m_2 \oplus m_3)\,G \oplus (e_2 \oplus e_3).$$

These L-1 = 2 differences $D_{12}$ and $D_{23}$ are computed by unit 13 and input into the first decoding unit 14.

[0033] In the decoding unit 14, said differences $D_{12}$, $D_{23}$ are each decoded to the closest code word from a subcode C' which consists of all possible differences of two consecutive code words c of the main code C. This could, e.g., be done by comparing the differences $D_{12}$, $D_{23}$ with all possible code words of the subcode C' and by selecting the closest code word as estimate u for $c(i, m_1) \oplus c(i+1, m_2)$ and as estimate v for $c(i+1, m_2) \oplus c(i+2, m_3)$. Thus, in the first decoding unit 14, estimates u, v for the differences of the pairs of code words $c_1 \oplus c_2$ and $c_2 \oplus c_3$, corresponding to the pairs of possibly mutilated code words $r_1 \oplus r_2$ and $r_2 \oplus r_3$, are obtained.
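The exhaustive comparison with all words of the difference subcode can be sketched as follows. This is a minimal illustration, not the decoder of unit 14: the function names are hypothetical, the caller is assumed to supply the list `allowed_A` of address differences that can occur, and the enumeration is only practical for small k - b.

```python
from itertools import product


def xor(x, y):
    """Coordinate-wise modulo-2 sum of two bit lists."""
    return [p ^ q for p, q in zip(x, y)]


def encode(d, G):
    """Code word d*G over GF(2)."""
    return [sum(d[r] * G[r][c] for r in range(len(G))) % 2 for c in range(len(G[0]))]


def difference_subcode(G, allowed_A):
    """All words (A, m1 xor m2)G for the supplied address differences A."""
    k, b = len(G), len(allowed_A[0])
    words = []
    for A in allowed_A:
        for dm in product([0, 1], repeat=k - b):
            words.append(encode(list(A) + list(dm), G))
    return words


def decode_difference(D, subcode):
    """Return the subcode word closest to D in Hamming distance."""
    return min(subcode, key=lambda c: sum(p != q for p, q in zip(c, D)))
```

With `D12 = xor(r1, r2)`, the call `decode_difference(D12, subcode)` yields the estimate u. If several subcode words are equally close, `min` simply returns the first one; a practical decoder might prefer to declare a decoding failure in that case.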

[0034] These estimates u, v are combined in unit 15 by computing

$$w_1 := r_1 = c(i, m_1) \oplus e_1$$
$$w_2 := r_2 \oplus u = r_1 \oplus (r_1 \oplus r_2 \oplus u)$$
$$w_3 := r_3 \oplus u \oplus v = r_1 \oplus (r_1 \oplus r_2 \oplus u) \oplus (r_2 \oplus r_3 \oplus v).$$

If the estimate u is correct, then $r_1 \oplus r_2 \oplus u = e_1 \oplus e_2$. Similarly, if v is correct, then $r_2 \oplus r_3 \oplus v = e_2 \oplus e_3$. Hence, if the estimates u and v both are correct, then

$$w_1 = c(i, m_1) \oplus e_1$$
$$w_2 = c(i, m_1) \oplus e_2$$
$$w_3 = c(i, m_1) \oplus e_3.$$

In combining unit 15, a number L = 3 of corrupted versions $w_1, w_2, w_3$ of the particular code word $c_1 = c(i, m_1)$ are thus obtained.
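The combining step can be sketched in a few lines. The function name `combine` and the list-of-bits representation are assumptions made for illustration; the function accumulates the estimates so that the j-th received word is corrected by the sum of the first j-1 estimates.

```python
def xor(x, y):
    """Coordinate-wise modulo-2 sum of two bit lists."""
    return [p ^ q for p, q in zip(x, y)]


def combine(received, estimates):
    """Return w_1, ..., w_L with w_j = r_j xor u_1 xor ... xor u_(j-1).

    received  : list of L possibly mutilated code words r_1, ..., r_L
    estimates : list of L-1 difference estimates (u, v, ... in the text)
    """
    versions = []
    acc = [0] * len(received[0])   # running sum of the estimates used so far
    for j, r_j in enumerate(received):
        if j > 0:
            acc = xor(acc, estimates[j - 1])
        versions.append(xor(r_j, acc))
    return versions
```

For L = 3, `combine([r1, r2, r3], [u, v])` returns `[r1, r2 xor u, r3 xor u xor v]`, which equals the first code word plus the individual error vectors whenever both estimates are correct.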
[0035] Next, in unit 16, the code vector z is constructed by component-wise majority voting of the corrupted versions $w_1, w_2, w_3$ of the code word $c_1$. That is, for each $i \in \{1, 2, \ldots, n\}$, the i-th component $z_i$ of the code vector z is an erasure if $w_{1,i}$, $w_{2,i}$ and $w_{3,i}$ are distinct; otherwise, the component $z_i$ equals the most frequent element among $w_{1,i}$, $w_{2,i}$, $w_{3,i}$. The code vector z is then decoded by a second decoding unit 17 for the code C, the second decoding unit 17 decoding the code vector z into a code word c' of said code C. Finally, in unit 18, the generator matrix G, which had been used by the encoder to encode the address words and the information words into code words, is used to retrieve the information word m and the address word a embedded in said code word c'.

[0036] In general, a number of corrupted versions of L consecutive code words are read, say $r_j = c(i+j-1, m_j) \oplus e_j$ for j = 1, 2, ..., L. Estimates for the differences of each of the (L-1) pairs of consecutive code words are obtained. By combining these estimates, L corrupted versions $w_1, w_2, \ldots, w_L$ of the code word $c_1 = c(i, m_1)$ are obtained. If all the estimates are correct, then it holds that $w_j = c(i, m_1) \oplus e_j$ for j = 1, 2, ..., L. The code vector z is obtained as the majority vote of $w_1, \ldots, w_L$ in each of the coordinates. If, in a certain coordinate, more than one symbol occurs most frequently, this coordinate in the code vector z is erased. Finally, the code vector z is decoded to a code word in the code C.
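Coordinate-wise majority voting with erasure on ties can be sketched as follows. The name `majority_vote` and the use of `None` as erasure marker are assumptions for illustration; the subsequent errors-and-erasures decoding in the code C is not shown.

```python
from collections import Counter

ERASURE = None  # marker for an erased coordinate (assumed representation)


def majority_vote(versions):
    """Form the code vector z from the corrupted versions w_1, ..., w_L.

    A coordinate is erased when more than one symbol occurs most frequently.
    """
    z = []
    for column in zip(*versions):
        ranked = Counter(column).most_common()
        tie = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        z.append(ERASURE if tie else ranked[0][0])
    return z
```

For binary symbols and odd L a tie cannot occur, so erasures only arise for even L or for non-binary symbol alphabets.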

[0037] For reducing decoding delay, it is advantageous if the number L is as small as possible. With L = 2, the described method is not appropriate, as majority voting on two alternatives is not useful. If reliability information, also known as soft decision information, is available on the bits of the possibly mutilated code words $r_1$ and $r_2$, reliability information for each of the bits of $w_2 := r_2 \oplus u$ can be obtained according to a well-known method. The code vector z can now, instead of by majority voting, be obtained by setting each coordinate $z_i$ to the more reliable of the bits $r_{1,i}$ and $w_{2,i}$. For enhancing the decoding of the code vector z, reliability information could be included for each of the bits of the code vector z. The reliability information for bit i in z is obtained by combining the reliability information of the bits $r_{1,i}$ and $w_{2,i}$.
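One possible way of selecting coordinates by reliability for L = 2 is sketched below. The specification does not state how reliabilities are represented or combined, so everything here is an assumption: each bit carries a non-negative reliability value, agreeing bits reinforce each other (sum of reliabilities), and conflicting bits are resolved in favour of the more reliable one with the difference as the resulting reliability. The function name `select_by_reliability` is hypothetical.

```python
def select_by_reliability(r1, rel1, w2, rel2):
    """Build the code vector z and its per-bit reliabilities from r_1 and w_2.

    r1, w2     : bit lists of equal length
    rel1, rel2 : non-negative reliability values per bit (assumed representation)
    """
    z, rel_z = [], []
    for x, rx, y, ry in zip(r1, rel1, w2, rel2):
        if x == y:
            # Both versions agree: keep the bit, reliabilities add up (assumed rule).
            z.append(x)
            rel_z.append(rx + ry)
        else:
            # Conflict: take the more reliable bit, confidence is reduced (assumed rule).
            z.append(x if rx >= ry else y)
            rel_z.append(abs(rx - ry))
    return z, rel_z
```

The returned reliabilities could then be passed to a soft-decision decoder for the code C, if one is available.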

[0038] Next, two special cases shall be briefly discussed. In a first special case, it is considered that the address word a(i) is the conventional b-bit binary representation of the integer i. For example, if b = 8, then a(57) = 00111001, as $57 = 0 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 1 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$. The binary representations of two consecutive integers nearly always have the same leftmost bit; the only exception is the address 011...1, which has 10...0 as a successor. Therefore, it can be assumed, with only a very small probability of being wrong, that the leftmost bit of the difference of two consecutive address words $A(i) := a(i) \oplus a(i+1)$ equals zero. More generally, it will be shown that it is very likely that A(i) starts with many zeros. In other words, a solution as proposed in the above-mentioned European patent applications EP 01201841.2 and EP 01203147.2, which use a priori known information in the decoder, can be applied, since it is known that many of the leftmost information bits of A(i) are (with a high probability) equal to zero.

[0039] Let i be an integer between 0 and $2^b - 2$. Let j be the number of ones in which a(i) ends, so that $0 \le j \le b-1$. One can write $a(i) = s\,0\,1^j$, where s has length b-j-1 and $1^j$ denotes a string of j ones. It can easily be seen that $a(i+1) = s\,1\,0^j$, and $A(i) = a(i+1) \oplus a(i) = 0^{b-j-1}\,1^{j+1}$.
[0040] The following conclusions can be drawn:

a) for each $i \in \{0, 1, \ldots, 2^b - 2\}$, A(i) is of the form $0^{b-m}\,1^m$ for some $m \in \{1, 2, \ldots, b\}$.

b) $A(i) = 0^{b-m}\,1^m$ if and only if a(i) ends in exactly (m-1) ones.
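Both conclusions can be checked exhaustively for small address lengths. The sketch below assumes the plain binary address representation of this special case; the helper names are hypothetical. The function `has_difference_form` also shows how a purported address difference could be tested for the form $0^{b-m}\,1^m$.

```python
def address_bits(i, b):
    """Plain b-bit binary representation of i, most significant bit first."""
    return [(i >> (b - 1 - t)) & 1 for t in range(b)]


def trailing_ones(bits):
    """Number of consecutive ones at the end of a bit list."""
    count = 0
    while count < len(bits) and bits[len(bits) - 1 - count] == 1:
        count += 1
    return count


def has_difference_form(bits):
    """True if bits equals 0^(b-m) 1^m for some m >= 1."""
    m = trailing_ones(bits)
    return m >= 1 and not any(bits[: len(bits) - m])


def check_conclusions(b):
    """Verify conclusions a) and b) for every i in {0, ..., 2^b - 2}."""
    for i in range(2 ** b - 1):
        a_i, a_next = address_bits(i, b), address_bits(i + 1, b)
        A = [p ^ q for p, q in zip(a_i, a_next)]
        m = trailing_ones(a_i) + 1
        if A != [0] * (b - m) + [1] * m or not has_difference_form(A):
            return False
    return True
```

Calling `check_conclusions(b)` for small b confirms that every difference of consecutive binary addresses consists of zeros followed by a run of ones whose length is one more than the number of trailing ones of a(i).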
[0041] From conclusion b), it follows that the number of integers i for which A(i) starts with b-m zeros and ends in m ones equals $2^{b-m}$. Stated differently, for $m \ge 1$, the fraction of integers $i \in \{0, 1, \ldots, 2^b - 2\}$ for which A(i) ends in exactly m ones equals $2^{b-m}/(2^b - 1) \approx (1/2)^m$.

[0042] For example, the fraction of integers i for which A(i) ends in at most 4 ones approximately equals 1/2+1/4+1/8+1/16 = 15/16 = 0.9375. That is, if it is assumed that A(i) starts with b-4 zeros, one is correct in nearly 94% of the cases. If it is assumed that A(i) starts with only b-8 zeros, the minimum Hamming distance obtained when applying the idea of using a priori known information symbols in the decoder drops, but the assumption is correct with a much larger probability of $255/256 \approx 0.9961$.
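The fractions quoted above can be reproduced exactly. The sketch below computes the closed-form fraction from conclusion b) and, as a cross-check, counts the integers directly; the function names and the choice b = 16 in the usage note are assumptions for illustration.

```python
from fractions import Fraction


def exact_fraction(b, t):
    """Fraction of i in {0, ..., 2^b - 2} for which A(i) ends in at most t ones (t <= b)."""
    return Fraction(sum(2 ** (b - m) for m in range(1, t + 1)), 2 ** b - 1)


def counted_fraction(b, t):
    """Same fraction, obtained by counting trailing ones of every i directly."""
    hits = 0
    for i in range(2 ** b - 1):
        ones = 0
        while (i >> ones) & 1:
            ones += 1
        # A(i) ends in exactly ones + 1 ones, see conclusion b).
        if ones + 1 <= t:
            hits += 1
    return Fraction(hits, 2 ** b - 1)
```

For instance, `float(exact_fraction(16, 4))` is close to 15/16 and `float(exact_fraction(16, 8))` is close to 255/256; the small deviation stems from the denominator being $2^b - 1$ rather than $2^b$.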

[0043] After decoding a corrupted version of $(A(i),\, m_1 \oplus m_2)\,G$, i.e., after decoding the difference $D_{12}$, it should be checked whether the purported value for A(i) is of the form $0^{b-m}\,1^m$ for some $m \ge 1$. If not, the decoding result should be rejected. This extra check greatly reduces the probability of incorrect decoding. According to another special case, all sectors have the same information word m, e.g., copyright information. The difference between the code words for the sectors i and i+1 can then be computed as follows:

$$c(i, m) \oplus c(i+1, m) = (A(i),\, 0)\,G.$$

[0044] In other words, the difference of any two consecutive code words is in the code $C_A$, defined as

$$C_A = \{(A(i),\, 0)\,G \mid 0 \le i \le 2^b - 2\}.$$
[0045] The difference of corrupted versions of two consecutive code words can therefore be decoded to the code $C_A$. The code $C_A$, which is a subcode of the main code C, has small cardinality if the set of difference vectors $\{A(i) \mid 0 \le i \le 2^b - 2\}$ has small cardinality, and in that case, its minimum distance may well exceed the minimum distance of the code C.

[0046] In the special case that the address word a(i) is the conventional binary representation of i, it holds that

$$C_A = \{(0^i\,1^{b-i},\, 0^{k-b})\,G \mid 0 \le i \le b-1\}.$$

Consequently, the subcode $C_A$ contains only b words, rather than $2^b - 1$ words, and optimal decoding can easily be performed by comparing the difference of two consecutive possibly mutilated words r with all b words from the subcode $C_A$. It should be noted that the number of comparisons to be made is linear, not exponential, in b.
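A minimal sketch of this b-word subcode and of decoding by exhaustive comparison is given below. The names `build_C_A` and `decode_to_C_A` are hypothetical, and the generator matrix is supplied by the caller; the sketch assumes binary addresses and identical information words in consecutive sectors, as in this special case.

```python
def encode(d, G):
    """Code word d*G over GF(2)."""
    return [sum(d[r] * G[r][c] for r in range(len(G))) % 2 for c in range(len(G[0]))]


def build_C_A(G, b):
    """The b words (0^i 1^(b-i), 0^(k-b))G for i = 0, ..., b-1."""
    k = len(G)
    return [encode([0] * i + [1] * (b - i) + [0] * (k - b), G) for i in range(b)]


def decode_to_C_A(D, C_A):
    """Compare D with all b subcode words and return the closest one."""
    return min(C_A, key=lambda c: sum(p != q for p, q in zip(c, D)))
```

The loop inside `min` performs exactly b Hamming-distance computations, which reflects the linear number of comparisons mentioned above.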
[0047] Since A(i) does not usually end in many ones, it might be considered to decode to an even smaller subcode, namely

$$C'_A = \{(0^i\,1^{b-i},\, 0^{k-b})\,G \mid b' \le i \le b-1\},$$

where b' is an integer between 0 and b-1. The larger b', the smaller $C'_A$, but also the smaller the likelihood of correct decoding.

[0048] Also in the case that the address representation a corresponds to binary Gray encoding, the subcode has only b elements. This is because, by definition, binary Gray encoding means that two consecutive addresses differ in only one position, that is, for each i, A(i) consists of one 1 and b-1 zeros.
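The Gray-code property can be checked with a short sketch. It assumes the standard reflected binary Gray code, computed as `i ^ (i >> 1)`; the specification does not say which Gray code is meant, so this choice and the function names are assumptions.

```python
def gray(i):
    """Reflected binary Gray code of the integer i (assumed variant)."""
    return i ^ (i >> 1)


def gray_differences(b):
    """Set of all address differences A(i) for i in {0, ..., 2^b - 2}, as integers."""
    return {gray(i) ^ gray(i + 1) for i in range(2 ** b - 1)}
```

For every b, each element of `gray_differences(b)` has exactly one bit set and the set contains b elements, so the corresponding difference subcode has b words when the information word is the same in consecutive sectors.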
[0049] The present invention constitutes an effective and reliable method for retrieving information stored in code words situated in several consecutive sectors or transmitted successively in a data stream. It exploits certain relationships between consecutive code words and jointly decodes several such consecutive code words. The present solution can be applied in any encoding and decoding system where address words having a known relationship are assigned to consecutive code words.